Gene knockout method based on base editing and its application

ABSTRACT

A gene knockout method based on base editing and its application are provided. The gene knockout method comprises: selecting a 20 bp-NGG target sequence in the coding region of the gene to be knocked out, such that it contains a complete target codon CAA, CAG or CGA; and using an sgRNA sequence to direct BE3 to the target sequence, so as to convert the target single base C of the target codon into T and thereby introduce the corresponding termination codon TAA, TAG or TGA and achieve the knockout, wherein the target single base C is preferably located at site 4-8 of the target sequence, the interval between the target codon and NGG is 12 to 14 bp, and the base (H) immediately upstream of the target codon cannot be G; and the sgRNA sequence is a 20 bp sequence complementary to the target sequence.

TECHNICAL FIELD

The invention relates to a gene knockout strategy. More specifically, it involves a gene knockout method based on base editing and its application.

BACKGROUND

Traditional targeted gene manipulation in eukaryotes is achieved through homologous recombination in pluripotent embryonic stem cells followed by blastocyst injection. Because pluripotent embryonic stem cell lines are difficult to establish, gene targeting by homologous recombination in such cells has been carried out mainly in mice (and has also been reported in rats) [Capecchi, 2005]. Another approach to targeted gene editing is cloning, that is, genetic modification of somatic cells followed by nuclear transfer. However, cloning technology has several defects [Carter et al., 2002; Zhu et al., 2004]. For example: 1. it is very difficult for cloned somatic cells to return completely to an undifferentiated state, which affects embryonic development and causes developmental defects; 2. all genetic material is of maternal origin only; 3. the success rate is low. Traditional gene targeting techniques therefore restrict the application of gene knockout.

Programmable endonuclease technologies include zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs) and the clustered regularly interspaced short palindromic repeats/CRISPR-associated protein 9 system (CRISPR/Cas9) [Kim and Kim, 2014]. The invention and spread of these technologies removed the dependence on pluripotent embryonic stem cells and made gene manipulation possible in different species. In particular, the CRISPR/Cas9 system, owing to its convenience, efficiency and low cost, was adopted worldwide soon after its appearance, became the newest, fastest-developing and most widely used technology in the area of gene editing, and brought about a revolution in the field. CRISPR/Cas9 has now been used successfully for DNA knockout, knock-in, DNA substitution, DNA modification, RNA modification, DNA labeling, gene transcription regulation, etc. [Hsu et al., 2014; Komor et al., 2017]. It has been applied successfully to gene editing in multiple species [Barrangou and Doudna, 2016; Komor et al., 2017].

CRISPR/Cas9-mediated site-specific gene editing relies on an sgRNA (single guide RNA) that is complementary to the target sequence and guides the Cas9 protein to cleave the double-stranded DNA at the target site, producing a double-strand break (DSB). In the absence of a template, the break is repaired by non-homologous end joining (NHEJ), which causes frameshift mutations and leads to knockout; in the presence of a template, homology-directed repair (HDR) proceeds by homologous recombination and leads to knock-in [Hsu et al., 2014; Kim and Kim, 2014; Komor et al., 2017]. Because HDR is inefficient (integration is rare) and the non-homologous end joining mechanism readily generates random insertions and deletions (indels), new bases can be introduced near the break point, making the gene editing inaccurate. In addition, CRISPR/Cas9-mediated gene editing has some off-target effects [Gorski et al., 2017].

SUMMARY OF THE INVENTION

One of the aims of this invention is to provide an efficient and accurate gene knockout strategy.

According to recent studies, Cas9 fusion proteins based on CRISPR/Cas9 technology can be used as base editors (BE). These fusion proteins comprise dCas9 or Cas9 nickase fused to the cytosine deaminase APOBEC1, which converts cytosine (C) into uracil (U) by deamination without cutting the DNA. Uracil is then converted to thymine (T) through DNA replication or repair. Similarly, a single base G can be converted into A when the edited C lies on the complementary strand. In particular, BE3, composed of Cas9 nickase and APOBEC1, raises the efficiency of base editing significantly, to 15-75%. Because no DNA cleavage producing a DSB is required, indels are formed at a frequency of less than 1%, and the gene editing is more accurate [Komor et al., 2016]. Moreover, this approach reduces off-target effects to a level ten times lower than the natural background, making the genetic editing safer [Nishida et al., 2017]. BE3 has been used successfully for in vivo base editing to achieve C→T mutation in mice, with an efficiency of 44-57% [Kim et al., 2017].

Based on the above BE-mediated single-base mutation, and especially on the accuracy and specificity of BE3-mediated single-base editing, the inventors designed a gene knockout strategy: introducing a termination codon by C→T mutation, for example by mutating CAA, CAG or CGA into the termination codon TAA, TAG or TGA, or by G→A mutation, mutating TGG into the termination codon TAA, TGA or TAG, so as to terminate translation of the encoded gene and achieve a gene knockout.
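The codon logic behind this strategy can be checked mechanically. The following is a minimal Python sketch, not part of the original disclosure, that enumerates every sense codon which a single C→T or a single G→A change turns into a termination codon under the standard genetic code. The function name `stop_creating_codons` is illustrative only.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}


def stop_creating_codons(src, dst):
    """Map each sense codon to the stop codons reachable by ONE src->dst change."""
    hits = {}
    for codon in map("".join, product("ACGT", repeat=3)):
        if codon in STOP_CODONS:
            continue  # only sense codons are useful knockout targets
        reachable = sorted(
            codon[:i] + dst + codon[i + 1:]
            for i, base in enumerate(codon)
            if base == src and codon[:i] + dst + codon[i + 1:] in STOP_CODONS
        )
        if reachable:
            hits[codon] = reachable
    return hits


# C->T on the strand shown, and G->A (a C->T edit on the opposite strand).
print(stop_creating_codons("C", "T"))
print(stop_creating_codons("G", "A"))
```

The enumeration returns CAA, CAG and CGA for C→T and only TGG for G→A, matching the codons named above. Reaching TAA from TGG requires converting both G bases, which this single-change enumeration deliberately does not count.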

According to the first aspect of the invention, a gene knockout method is provided, which includes:

-   selecting a 20 bp-NGG target sequence (NGG being the PAM sequence) in the coding region (CDS) of the gene to be knocked out, such that it contains a complete target codon CAA, CAG or CGA; and using an sgRNA sequence to direct BE3 to the target sequence, so as to convert the target single base C of the target codon into T and thereby introduce the corresponding termination codon TAA, TAG or TGA and achieve the knockout,
-   wherein the target single base C is located at site 1-8 of the target sequence (counted from the left end), preferably at site 4 to 8; the interval between the target codon and NGG is 12 to 14 bp, preferably 14 bp; the base immediately upstream of the target codon cannot be G; and the sgRNA sequence is a 20 bp sequence complementary to the target sequence.

Alternatively, in the method mentioned above, a CCN-20 bp target sequence (CCN being the PAM) in the coding region of the gene to be knocked out can be selected so as to include a complete target codon TGG, in which case the base (D) immediately downstream of the target codon cannot be C. Accordingly, the target single base G is located at site 1-8 of the target sequence (counted from the right end), preferably at site 4-8, and the interval between the target codon and CCN is 12-14 bp, preferably 14 bp.

According to this invention, BE3 may be selected from the group consisting of:

-   rAPOBEC1-SaCas9-NLS-UGI-NLS;
-   3xUGI-rAPOBEC1-SaCas9-NLS-UGI-NLS;
-   rAPOBEC1-SpCas9-NLS-UGI-NLS; and
-   3xUGI-rAPOBEC1-SpCas9-NLS-UGI-NLS,

preferably from the group consisting of the last two.

According to the present invention, the method can be used to knock out the following eight target genes: human PD1, LAG3, TIGIT, VISTA, 2B4 and CD160, and mouse TIM3 and LAG3. The corresponding sgRNA sequences are complementary to the gene sequences shown in target sequences 1 to 8, respectively.

According to the second aspect of the present invention, an application of the method to knocking out the human PD1, LAG3, TIGIT, VISTA, 2B4 and CD160 genes in the HEK293T cell line is provided.

According to the third aspect of the present invention, an application of the method to knocking out the human PD1, LAG3, TIGIT, VISTA, 2B4 and CD160 genes in human T cells is provided.

According to the fourth aspect of the present invention, isolated T cells or cell lines obtained by the above applications, or their subcultures, are provided.

According to the fifth aspect of the present invention, a kit for gene knockout is provided, including sgRNA (corresponding to the gene to be knocked out), BE3 and corresponding amplification reagents.

Based on base editing techniques developed from CRISPR/Cas9, this invention establishes a gene knockout strategy that is more efficient and accurate than CRISPR/Cas9 and has fewer off-target effects, by creating termination codons through accurate C→T or G→A single-base mutation.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustratively shows knocking out the target gene by C→T mutation according to the present invention; and

FIGS. 2-5 illustratively show different BE3 structures.

EMBODIMENTS

Firstly, different BE3 constructs were built, as shown in FIGS. 2-5. By fusing different Cas9 nickases with the cytosine deaminase APOBEC1, the following four BE3 constructs were formed:

-   (1) rAPOBEC1-SaCas9-NLS-UGI-NLS, FIG. 2, Sequence 1;
-   (2) 3xUGI-rAPOBEC1-SaCas9-NLS-UGI-NLS, FIG. 3, Sequence 2;
-   (3) rAPOBEC1-SpCas9-NLS-UGI-NLS, FIG. 4, Sequence 3;
-   (4) 3xUGI-rAPOBEC1-SpCas9-NLS-UGI-NLS, FIG. 5, Sequence 4.

In the following gene knockouts, any of the above BE3 constructs can be used, preferably (3) or (4).

Next, the sgRNA is designed. Base editing uses the sgRNA to direct BE3 to a specific site, so the key to the invention is the selection and design of the target-gene-specific sgRNA. The present invention selects and designs the sgRNA as follows: selecting a 20 bp-NGG target sequence (NGG being the PAM sequence) in the coding region of the gene to be knocked out, such that it includes a complete target codon CAA, CAG or CGA, wherein the target single base C is preferably located at site 4-8 of the target sequence (counted from the left end), the interval between the target codon and NGG is preferably 14 bp, and the base (H) immediately upstream of the target codon cannot be G; and preparing a 20 bp sgRNA sequence complementary to the target sequence.

Alternatively, a CCN-20 bp target sequence (PAM) of the coding region of the gene to be knocked out is selected to include a complete target codon TGG, and the downstream base (D) near the target codon cannot be C. Accordingly, the target single base G is preferably located on site 4-8 (at the right end) of the target sequence, and the interval of the target codon and CCN is preferably 14 bp.
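The selection rules for the 20 bp-NGG case can be sketched as a small scanner. This is a minimal illustration under stated assumptions, not the inventors' implementation: the coding sequence is assumed to be supplied in frame (index 0 is the first base of a codon), the "interval" is read as the number of bases between the end of the target codon and the PAM, and all names (`Candidate`, `find_candidates`, `demo_cds`) are hypothetical. The demo sequence is the hPD-1 Sg-1 target with an ATG prefix and an AGG PAM added purely for illustration; neither is taken from the actual PD1 gene context.

```python
from dataclasses import dataclass

TARGET_CODONS = {"CAA": "TAA", "CAG": "TAG", "CGA": "TGA"}
PROTOSPACER_LEN = 20


@dataclass(frozen=True)
class Candidate:
    protospacer: str  # 20-nt target sequence, 5'->3'
    pam: str          # the 3 nt immediately 3' of the target sequence
    codon: str        # in-frame target codon
    stop: str         # stop codon produced by C->T at the codon's first base
    c_position: int   # 1-based site of the target C, counted from the left end
    interval: int     # bases between the end of the codon and the PAM (assumed reading)


def find_candidates(cds, positions=range(4, 9), intervals=range(12, 15)):
    """Scan an in-frame coding sequence for C->T stop-codon target sites."""
    cds = cds.upper()
    hits = []
    for codon_start in range(0, len(cds) - 2, 3):  # walk the reading frame
        codon = cds[codon_start:codon_start + 3]
        if codon not in TARGET_CODONS:
            continue
        # The base immediately upstream of the target codon must not be G.
        if codon_start > 0 and cds[codon_start - 1] == "G":
            continue
        for pos in positions:
            start = codon_start - (pos - 1)        # left end of the 20-nt window
            pam_start = start + PROTOSPACER_LEN
            if start < 0 or pam_start + 3 > len(cds):
                continue
            pam = cds[pam_start:pam_start + 3]
            if pam[1:] != "GG":                    # NGG: any base, then GG
                continue
            interval = PROTOSPACER_LEN - (pos + 2)
            if interval not in intervals:
                continue
            hits.append(Candidate(cds[start:pam_start], pam, codon,
                                  TARGET_CODONS[codon], pos, interval))
    return hits


# Hypothetical demo: ATG + hPD-1 Sg-1 target + an assumed AGG PAM.
demo_cds = "ATG" + "CTACAACTGGGCTGGCGGCC" + "AGG"
for site in find_candidates(demo_cds):
    print(site)
```

Under this reading of the interval, a 12 to 14 bp spacing corresponds to the target C sitting at site 4 to 6, so the default `intervals` argument narrows the preferred 4-8 window; passing a wider range explores the rest of it. The CCN-20 bp / TGG alternative is not covered by this sketch.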

With respect to the 8 different target genes (human PD1, LAG3, TIGIT, VISTA, 2B4 and CD160, and mouse TIM3 and LAG3), the following target gene sequences are selected to design the corresponding sgRNAs according to the present invention (the bold and underlined portions represent PAMs; and the italic and underlined portions represent the candidate codons to be mutated):

1. hPD-1
Sg-1: CTA CAA CTGGGCTGGCGGCC
Sg-2: CAG CAA CCAGACGGACAAGC
Sg-3: CGGC CAG TTCCAAACCCTGG

2. hLAG3
Sg-1: GACCATAGGAGAGATG TGG G
Sg-2: TAGGAGAGATG TGG GAGGCT
Sg-3: GCGGCGCCCTCCTCC TGG GG

3. hTIGIT
Sg-1: GAT CGA GTGGCCCCAGGTCC

4. hVISTA
Sg-1: TCTACAAGACG TGG TACCGC

5. 2B4
Sg-1: GCAGCT CAG CAGCAGGACAG

6. hCD160
Sg-1: AAAA CAG CTGAGACTTAAAA

7. mTIM3
Sg-1: CGTGCCCGTCTGC TGG GGCA

8. mLAG3
Sg-1: GACCATAGGAGAGATG TGG

For the target gene sequences selected above, namely human PD1 (3), LAG3 (3), TIGIT, VISTA, 2B4 and CD160, and mouse TIM3 and LAG3, the corresponding sgRNA expression vectors are constructed by cloning the different sgRNAs into pGL3-U6-sgRNA respectively.

Example 1

In a cell line, BE3-mediated base editing is performed and a termination codon is introduced to achieve gene knockout. Knockout in cell lines is carried out by conventional methods (electrotransfection or liposome transfection); liposome transfection is taken as an example here.

-   (1) Taking HEK293T cells as an example, eukaryotic cells are cultured and transfected according to the present invention: HEK293T cells are seeded and cultured in high-glucose DMEM medium supplemented with 10% FBS (HyClone, SH30022.01B), penicillin (100 U/ml) and streptomycin (100 μg/ml).
-   (2) The cells are distributed into a 6-well plate before transfection, and transfection is carried out at a density of 70%-80%.
-   (3) Taking liposome transfection as an example: following the manual of the Lipofectamine™ 2000 Transfection Reagent (Invitrogen, 11668-019), and taking the SpCas9 nickase as an example, 2 μg of BE3 plasmid and 2 μg of pGL3-U6-sgRNA plasmid are mixed evenly and co-transfected into each well. The medium is changed every six to eight hours, and the cells are collected after 72 hours.
-   (4) Analysis of genotype
-   A. Some cells are collected in lysis buffer (10 μM Tris-HCl, 0.4 M NaCl, 2 μM EDTA, 1% SDS) and digested with 100 μg/ml proteinase K. After digestion, the DNA is extracted with phenol-chloroform and dissolved in 50 μl of deionized water.
-   B. A pair of primers, N-For and N-Rev, is used for PCR amplification. The PCR product is recovered and purified with the AxyPrep PCR cleanup kit. 200 ng of product is diluted to 20 μl for denaturation and annealing: 95 °C, 5 min; 95-85 °C at −2 °C/s; 85-25 °C at −0.1 °C/s; hold at 4 °C.
-   C. An adenine (A) base is added to the ends of the recovered PCR product with rTaq. The A-addition reaction system comprises:
    -   700-800 ng recovered PCR product
    -   5 μl 10× Buffer (Mg²⁺ Plus)
    -   4 μl dNTP
    -   0.5 μl rTaq (TAKARA, R001 AM)
    -   water added to a final volume of 50 μl.

After incubation at 37 °C for 30 minutes, 1 μl of the product is taken and ligated into the pMD19-T vector (TAKARA, 3271), which is then used to transform DH5 competent cells (TransGen, CD201).

-   D. Single clones are picked and the target gene mutation is sequenced with the universal primer M13-F. The sequencing results are shown below (the bold and underlined portions represent PAMs; the italics represent mutant codons; and the italic and underlined portions represent mutant bases):

1. hPD-1
Sg-1: CTACAACTGGGCTGGCGGCC
Mut: CTA T AACTGGGCTGGCGGCC
Sg-2: CAGCAACCAGACGGACAAGC
Mut: CAG T AACCAGACGGACAAGC
Sg-3: CGGCCAGTTCCAAACCCTGG
Mut: CGGC T AGTTCCAAACCCTGG

2. hLAG3
Sg-1: GACCATAGGAGAGATGTGGG
Mut: GACCATAGGAGAGATGTG A G
Sg-2: TAGGAGAGATGTGGGAGGCT
Mut: TAGGAGAGATGTG A GAGGCT
Sg-3: GCGGCGCCCTCCTCCTGGGG
Mut: GCGGCGCCCTCCTCCTG A GG

3. hTIGIT
Sg-1: GATCGAGTGGCCCCAGGTCC
Mut: GAT T GAGTGGCCCCAGGTCC

4. hVISTA
Sg-1: TCTACAAGACGTGGTACCGC
Mut: TCTACAAGACGTG A TACCGC

5. 2B4
Sg-1: GCAGCTCAGCAGCAGGACAG
Mut: GCAGCT T AGCAGCAGGACAG

6. hCD160
Sg-1: AAAACAGCTGAGACTTAAAA
Mut: AAAA T AGCTGAGACTTAAAA

The results show that the target bases at the sgRNA-targeted sites of the target genes are mutated, termination codons are introduced, and the gene knockouts of PD1, LAG3, TIGIT, VISTA, 2B4 and CD160 are achieved successfully.
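A sequencing read of this kind can be compared against the unedited target to confirm both the base change and the new termination codon. The following is a minimal Python sketch, assuming reads already trimmed to the same 20-nt window as the target; the helper names are hypothetical. It uses the hPD-1 Sg-3 pair from the results above, with the reading frame taken from the candidate codon marked for that target (CAG starting at the fifth base).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def substitutions(reference, read):
    """Return (1-based position, reference base, read base) for each mismatch."""
    if len(reference) != len(read):
        raise ValueError("sequences must be aligned and of equal length")
    return [(i + 1, r, m)
            for i, (r, m) in enumerate(zip(reference, read))
            if r != m]


def new_stop_codons(reference, read, codon_index):
    """List in-frame codons that became stop codons in the read.

    codon_index is the 0-based index of the first base of any codon known
    to be in frame; it fixes the reading frame for the whole window.
    """
    found = []
    for start in range(codon_index % 3, len(read) - 2, 3):
        before, after = reference[start:start + 3], read[start:start + 3]
        if after in STOP_CODONS and before not in STOP_CODONS:
            found.append((start + 1, before, after))
    return found


wild_type = "CGGCCAGTTCCAAACCCTGG"  # hPD-1 Sg-3 target
mutant = "CGGCTAGTTCCAAACCCTGG"     # hPD-1 Sg-3 sequencing result

print(substitutions(wild_type, mutant))       # single C->T change at site 5
print(new_stop_codons(wild_type, mutant, 4))  # CAG -> TAG at sites 5-7
```

Applied to many clones, the same two checks would let edited reads be tallied automatically, although the source does not describe how its clones were scored.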

Example 2

In primary cells, BE3-mediated base editing is performed and a termination codon is introduced to achieve gene knockout.

Gene knockout in primary human T cells is carried out by conventional methods (electrotransfection or liposome transfection); electrotransfection is taken as an example here.

(1) Separation and purification of PBMC cells

-   A. Peripheral blood is collected in an anticoagulant tube, with the tube shaken during collection so that the blood is fully mixed with the anticoagulant;
-   B. The peripheral blood is mixed with an equal volume of lymphocyte separation medium and centrifuged, and the buffy coat (white membrane layer) of cells obtained after centrifugation is drawn off;
-   C. The buffy coat cells are mixed with PBS or serum-free 1640 cell culture medium and centrifuged; the pellet is the PBMC cells. This step is repeated three times.

(2) Enrichment of CD3-positive cells

-   A. Adjust the PBMC concentration to 50×10⁶ cells/ml;
-   B. Add 50 μl of CD3+ enrichment antibody cocktail per ml, mix, and let stand at room temperature for 5 minutes;
-   C. Add 150 μl of magnetic beads per ml, mix, and let stand at room temperature for 10 minutes;
-   D. Place the centrifuge tube on a magnetic rack and let stand for 5 minutes, then transfer the upper cell suspension to a new 15 ml centrifuge tube;
-   E. Repeat the operation once;
-   F. Centrifuge at 300 × g at room temperature for 10 minutes and collect the cells;
-   G. Count the cells.

(3) Electrotransfection of CD3-positive cells

-   A. Prepare the electrotransfection system: add 8 μg of BE3 plasmid and 8 μg of pGL3-U6-sgRNA plasmid to a 1.5 ml centrifuge tube, then add 82 μl of electrotransfection buffer and 18 μl of supplement 1 according to the instructions of the Lonza Amaxa electrotransfection kit, and mix evenly.
-   B. Collect 20×10⁶ cells in a 15 ml centrifuge tube, centrifuge at 300 × g for 10 minutes, and discard the supernatant.
-   C. Resuspend the cells in the electrotransfection solution prepared in A and transfer them to an electrotransfection cuvette.
-   D. Carry out electrotransfection on the Lonza 2B instrument using program U-014.
-   E. After electrotransfection, quickly transfer the cells to preheated AIM-V medium supplemented with 10% FBS and incubate them in a 5% CO₂ incubator at 37 °C for 2 hours.
-   F. Change the medium, resuspend the cells at a density of 1×10⁶/ml, and incubate overnight.

(4) Activation and cultivation of T cells

-   A. 24 hours after electrotransfection, add 100 U/ml IL-2 to the medium and add CD3/CD28 Dynabeads at a ratio of 1:1 to activate the T cells.
-   B. Every two days, replace half of the medium or add IL-2, so that the cell density is always maintained at 1×10⁶/ml.
-   C. After five days of activation, collect the T cells in a 15 ml centrifuge tube placed on a magnetic rack, and slowly transfer the supernatant to another clean 15 ml centrifuge tube. Repeat this step once.
-   D. Centrifuge at 300 × g at room temperature for 10 minutes, remove the supernatant, and resuspend the cells in AIM-V medium containing 10% FBS and 300 U/ml IL-2, with the density controlled at 1×10⁶/ml.
-   E. Every two days, replace half of the medium or add IL-2, and count the cells, with the cell density always maintained at 1×10⁶/ml.

(5) Analysis of genotype

-   A. Some cells are collected in lysis buffer (10 μM Tris-HCl, 0.4 M NaCl, 2 μM EDTA, 1% SDS) and digested with 100 μg/ml proteinase K. After digestion, the DNA is extracted with phenol-chloroform and dissolved in 50 μl of deionized water.
-   B. A pair of primers, N-For and N-Rev, is used for PCR amplification. The PCR product is recovered and purified with the AxyPrep PCR cleanup kit. 200 ng of product is diluted to 20 μl for denaturation and annealing: 95 °C, 5 min; 95-85 °C at −2 °C/s; 85-25 °C at −0.1 °C/s; hold at 4 °C.
-   C. An adenine (A) base is added to the ends of the recovered PCR product with rTaq. The A-addition reaction system comprises:
    -   700-800 ng recovered PCR product
    -   5 μl 10× Buffer (Mg²⁺ Plus)
    -   4 μl dNTP
    -   0.5 μl rTaq (TAKARA, R001 AM)
    -   water added to a final volume of 50 μl.

After incubation at 37 °C for 30 minutes, 1 μl of the product is taken and ligated into the pMD19-T vector (TAKARA, 3271), which is then used to transform DH5 competent cells (TransGen, CD201).

-   D. Single clones are picked and the target gene mutations in each T cell sample are sequenced with the universal primer M13-F. The sequencing results are shown below (the bold and underlined portions represent PAMs; the italics represent mutant codons; and the italic and underlined portions represent mutant bases):

1. hPD-1
Sg-1: CTACAACTGGGCTGGCGGCC
Mut: CTA T AACTGGGCTGGCGGCC
Sg-2: CAGCAACCAGACGGACAAGC
Mut: CAG T AACCAGACGGACAAGC
Sg-3: CGGCCAGTTCCAAACCCTGG
Mut: CGGC T AGTTCCAAACCCTGG

2. hLAG3
Sg-1: GACCATAGGAGAGATGTGGG
Mut: GACCATAGGAGAGATGTG A G
Sg-2: TAGGAGAGATGTGGGAGGCT
Mut: TAGGAGAGATGTG A GAGGCT
Sg-3: GCGGCGCCCTCCTCCTGGGG
Mut: GCGGCGCCCTCCTCCTG A GG

3. hTIGIT
Sg-1: GATCGAGTGGCCCCAGGTCC
Mut: GAT T GAGTGGCCCCAGGTCC

4. hVISTA
Sg-1: TCTACAAGACGTGGTACCGC
Mut: TCTACAAGACGT A GTACCGC

5. 2B4
Sg-1: GCAGCTCAGCAGCAGGACAG
Mut: GCAGCT T AGCAGCAGGACAG

6. hCD160
Sg-1: AAAACAGCTGAGACTTAAAA
Mut: AAAA T AGCTGAGACTTAAAA

The results show that the target bases at the sgRNA-targeted sites of the target genes are mutated, termination codons are introduced, and the gene knockouts of PD1, LAG3, TIGIT, VISTA, 2B4 and CD160 are achieved successfully.

Example 3 Developing BE3 Mediated Gene Knockout Mice

Mouse embryo collection, embryo microinjection, embryo culture, embryo transfer, etc. are carried out by conventional procedures. As an example, the TIM3 and LAG3 genes are knocked out in mice.

-   (1) Microinjection: fertilized eggs are injected with BE3 mRNA and TIM3-specific sgRNA (corresponding to Sequence 7 above), or with BE3 mRNA and LAG3-specific sgRNA (corresponding to Sequence 8 above), respectively. Conventional embryo transfer is then conducted;
-   (2) Genotype analysis: genomic DNA is extracted from tail clippings by the conventional method, the coding regions are amplified by PCR respectively, and the products are analyzed by Sanger sequencing. The sequencing results are shown below (the bold and underlined portions represent PAMs; the italics represent mutant codons; and the italic and underlined portions represent mutant bases):

7. mTIM3
Sg-1: CGTGCCCGTCTGCTGGGGCA
Mut: CGTGCCCGTCTGCT A GGGCA

8. mLAG3
Sg-1: GACCATAGGAGAGATGTGG
Mut: GACCATAGGAGAGATGTG A

The above results demonstrate base conversion at the TIM3 and LAG3 target sites (G→A on the strand shown, corresponding to C→T on the complementary strand) and introduction of a termination codon. TIM3 and LAG3 knockout mice have been developed successfully.
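The relationship between the change seen on the listed strand and the edit on the opposite strand can be verified with a short sketch. This is an illustrative Python check only, with hypothetical helper names, applied to the mTIM3 sequences shown above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the opposite strand, read 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def substitutions(reference, read):
    """Return (1-based position, reference base, read base) for each mismatch."""
    return [(i + 1, r, m)
            for i, (r, m) in enumerate(zip(reference, read))
            if r != m]


wild_type = "CGTGCCCGTCTGCTGGGGCA"  # mTIM3 Sg-1 target
mutant = "CGTGCCCGTCTGCTAGGGCA"     # mTIM3 sequencing result

# On the strand as listed, the change is G->A inside the TGG codon.
print(substitutions(wild_type, mutant))

# On the opposite strand the same event is a C->T change, and its position
# counted from that strand's 5' end equals the site counted from the right
# end of the listed sequence.
print(substitutions(reverse_complement(wild_type), reverse_complement(mutant)))
```

For this target the G→A change sits at site 15 from the left, which is site 6 from the right end, inside the preferred 4-8 window described for the CCN-20 bp design.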

1. A gene knockout method for knocking out a gene, comprising: selecting a 20 bp-NGG target sequence in the coding region of the gene to be knocked out, such that it contains a complete target codon CAA, CAG or CGA; and using an sgRNA sequence to position BE3 at the target sequence to convert the target single base C of the target codon into T, in order to introduce a corresponding termination codon TAA, TAG or TGA and achieve the gene knockout, wherein the target single base C is located at site 1 to 8 of the target sequence; the interval between the target codon and NGG is 12 to 14 bp; the upstream base near the target codon cannot be G; and the sgRNA sequence is a 20 bp sequence complementary to the target sequence.
2. The method of claim 1, wherein BE3 is selected from the group consisting of: rAPOBEC1-SaCas9-NLS-UGI-NLS, 3xUGI-rAPOBEC1-SaCas9-NLS-UGI-NLS, rAPOBEC1-SpCas9-NLS-UGI-NLS and 3xUGI-rAPOBEC1-SpCas9-NLS-UGI-NLS.
3. The method of claim 1, wherein the gene to be knocked out is selected from the group consisting of: human PD1, LAG3, TIGIT, VISTA, 2B4 and CD16, and mouse TIM3 and LAG3; and wherein the corresponding sgRNA sequences are complementary to the target sequences shown in SEQ 1 to SEQ 8.
4. The method of claim 1, wherein the genes knocked out are from the HEK293T cell line and are selected from the group consisting of human PD1, LAG3, TIGIT, VISTA, 2B4 and CD16.
5. The method of claim 1, wherein the genes knocked out are from human T cells and are selected from the group consisting of human PD1, LAG3, TIGIT, VISTA, 2B4 and CD16.
6. Cells produced by the method of claim 4.
7. A kit for gene knockout comprising sgRNA and BE3 as recited in claim 1, and amplification reagents.
8. Cells produced by the method of claim 5.
9. A kit for gene knockout comprising sgRNA and BE3 as recited in claim 2, and amplification reagents.